{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src='http://www-scf.usc.edu/~ghasemig/images/sharif.png' alt=\"SUT logo\" width=200 height=200 align=left class=\"saturate\" >\n",
    "\n",
    "<br>\n",
    "<font face=\"Times New Roman\">\n",
    "<div dir=ltr align=center>\n",
    "<font color=0F5298 size=7>\n",
    "    Introduction to Machine Learning <br>\n",
    "<font color=2565AE size=5>\n",
    "    Computer Engineering Department <br>\n",
    "    Fall 2022<br>\n",
    "<font color=3C99D size=5>\n",
    "    Homework 3: Practical - Neural Network <br>\n",
    "<font color=696880 size=4>\n",
    "    Javad Hezareh \n",
    "    \n",
    "    \n",
    "____\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Full Name : \n",
    "### Student Number : \n",
    "___"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Problem\n",
    "In this assignment our goal is to develop a framework for simple neural networks, multi layer perceptrons. We are going to use only `numpy` and no other packages to build our own classes and network."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "D_a46D-Gz3Rj"
   },
   "outputs": [],
   "source": [
    "###################################\n",
    "#  Do Not Add any other packages  #\n",
    "###################################\n",
    "\n",
    "import numpy as np\n",
    "import seaborn as sn\n",
    "import matplotlib.pyplot as plt\n",
    "import tqdm\n",
    "import copy\n",
    "from utils import *\n",
    "\n",
    "plt.style.use('ggplot')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Section 1: Modules implementation (65 Points)\n",
    "We are going to implement required modules for a neural net. Each of this modules must implement the neccessery functions, `_forward` and `backward`. In the following parts, we will implement `LinearLayer`, `ReLU` and `SoftMax` layers."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Layers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Linear Layer (10 Points)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "moGEgpV999cl"
   },
   "outputs": [],
   "source": [
    "class LinearLayer(Module):\n",
    "    \"\"\"\n",
    "    A linear layer module which calculate (Wx + b).\n",
    "    \"\"\"\n",
    "\n",
    "    def __init__(self, dim_in, dim_out, initializer, reg, alpha):\n",
    "        \"\"\"\n",
    "        Args:\n",
    "            - dim_in: input dimension,\n",
    "            - dim_out: output dimension,\n",
    "            - initializer: a function which get (dim_in, dim_out) and initialize\n",
    "                a [dim_in x dim_out] matrix,\n",
    "            - reg: L2-regularization flag\n",
    "            - alpha: L2-regularization coefficient\n",
    "        \"\"\"\n",
    "        self.dim_in = dim_in\n",
    "        self.dim_out = dim_out\n",
    "        self.params = {\n",
    "            #########################################\n",
    "            ##          Initialize parameters      ##\n",
    "            ##              Your Code              ##\n",
    "            #########################################\n",
    "            'W': None,\n",
    "            'b': None\n",
    "        }\n",
    "        self.grads = dict()\n",
    "        self.cache = dict()\n",
    "\n",
    "    def _forward(self, x):\n",
    "        \"\"\"\n",
    "        linear forward function, calculate Wx+b for a batch of data\n",
    "\n",
    "        Args:\n",
    "            x : a batch of data\n",
    "\n",
    "        Note:\n",
    "            you need to store some values in cache to be able to\n",
    "            calculate backward path.\n",
    "        \"\"\"\n",
    "        #########################################\n",
    "        ##              Your Code              ##\n",
    "        #########################################\n",
    "        y = None\n",
    "\n",
    "        return y\n",
    "\n",
    "    def backward(self, upstream):\n",
    "        \"\"\"\n",
    "        get upstream gradient and returns downstream gradient\n",
    "\n",
    "        Args:\n",
    "            upstream : upstream gradient of loss w.r.t module output\n",
    "\n",
    "        Note:\n",
    "            you need to calculate gradient of loss w.r.t module input\n",
    "            and parameters and store them in grads.\n",
    "        \"\"\"\n",
    "        #########################################\n",
    "        ##              Your Code              ##\n",
    "        #########################################\n",
    "        grad_b = None\n",
    "        grad_w = None\n",
    "        grad_x = None\n",
    "        grad_reg = None\n",
    "\n",
    "        self.grads = {\n",
    "            'W': grad_w,\n",
    "            'b': grad_b,\n",
    "            'x': grad_x,\n",
    "            'reg': grad_reg\n",
    "        }\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# sanity check, output must be from o(e-5)\n",
    "initializer = lambda x, y: np.random.normal(size=(y, x))\n",
    "linear = LinearLayer(5, 10, initializer, reg=True, alpha=0.001)\n",
    "check_gradient_linear(linear, h=0.00001)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### ReLU Layer (5 Points)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "RSsLumM29_eT"
   },
   "outputs": [],
   "source": [
    "class ReLU(Module):\n",
    "    \"\"\"\n",
    "    Rectified Linear Unit function\n",
    "    \"\"\"\n",
    "\n",
    "    def __init__(self):\n",
    "        self.cache = dict()\n",
    "        self.grads = dict()\n",
    "\n",
    "    def _forward(self, x):\n",
    "        \"\"\"\n",
    "        applies relu function on x\n",
    "\n",
    "        Args:\n",
    "            x : a batch of data\n",
    "\n",
    "        Returns:\n",
    "            y : relu of input\n",
    "        \"\"\"\n",
    "        #########################################\n",
    "        ##              Your Code              ##\n",
    "        #########################################\n",
    "        y = None\n",
    "        return y\n",
    "\n",
    "    def backward(self, upstream):\n",
    "        \"\"\"\n",
    "        calculate and store gradient of loss w.r.t module input\n",
    "\n",
    "        Args:\n",
    "            upstream : gradient of loss w.r.t modele output\n",
    "        \"\"\"\n",
    "        #########################################\n",
    "        ##              Your Code              ##\n",
    "        #########################################\n",
    "        grad_x = None\n",
    "\n",
    "        self.grads['x'] = grad_x\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# sanity check - output must be from o(e-8)\n",
    "relu = ReLU()\n",
    "check_gradient_relu(relu)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### SoftMax Layer (15 Points)\n",
    "\n",
    "We could have a layer that calculate softmax for us. In other word, for input $x\\in\\mathcal{R}^N$ it would return $y\\in\\mathcal{R}^n$ where $y_i = \\frac{e^{x_i}}{\\sum e^{x_i}}$. But this method is not numerical stable because $e^{x_i}$ in this formulation can get very large easly and return `nan`. Instead of that we will implement a logarithmic version of softmax which instead of calculating $\\frac{e^{x_i}}{\\sum e^{x_i}}$, we will calculate $\\log\\left(\\frac{e^{x_i}}{\\sum e^{x_i}}\\right) = x_i - \\log\\sum e^{x_i}$. In order to calculate second term you can use `np.logaddexp` but this function only works on two input. For more than two input, fill in the following function to be able to calculate log sum exp of an array of shape (b,n). `axis=1` means sum over columns and `axis=0` sum over rows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def logsumexp(array, axis=1):\n",
    "    \"\"\"\n",
    "    calculate log(sum(exp(array))) using np.logaddexp\n",
    "\n",
    "    Args:\n",
    "        array : input array\n",
    "        axis : reduce axis, 1 means columns and 0 means rows\n",
    "    \"\"\"\n",
    "    assert len(array) >= 2\n",
    "    #########################################\n",
    "    ##              Your Code              ##\n",
    "    #########################################\n",
    "    pass"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "R4v5_UBB-BCK"
   },
   "outputs": [],
   "source": [
    "class LogSoftMax(Module):\n",
    "    def __init__(self):\n",
    "        self.cache = dict()\n",
    "        self.grads = dict()\n",
    "\n",
    "    def _forward(self, x):\n",
    "        \"\"\"\n",
    "        get x and calculate softmax of that.\n",
    "\n",
    "        Args:\n",
    "            x : batch of data with shape (b,m)\n",
    "\n",
    "        Returns:\n",
    "            y : log softmax of x with shape (b,m)\n",
    "        \"\"\"\n",
    "        #########################################\n",
    "        ##              Your Code              ##\n",
    "        #########################################\n",
    "        y = None\n",
    "        \n",
    "        return y\n",
    "\n",
    "    def backward(self, upstream):\n",
    "        \"\"\"\n",
    "        calculate gradient of loss w.r.t module input and save that in grads.\n",
    "\n",
    "        Args:\n",
    "            upstream : gradient of loss w.r.t module output with sahpe (b,m)\n",
    "        \"\"\"\n",
    "        #########################################\n",
    "        ##              Your Code              ##\n",
    "        #########################################\n",
    "        grad_x = None\n",
    "        \n",
    "        self.grads['x'] = grad_x\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# sanity check, output must be from o(e-7)\n",
    "sm = LogSoftMax()\n",
    "check_gradient_softmax(sm)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Model (10 Points)\n",
    "We need a model class which gathers our layers togather and performs forward and backward on all of them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class MLPModel(Module):\n",
    "    \"\"\"\n",
    "    A multilayer neural network model\n",
    "    \"\"\"\n",
    "\n",
    "    def __init__(self, layers):\n",
    "        \"\"\"\n",
    "        Args:\n",
    "            layers : list of model layers\n",
    "        \"\"\"\n",
    "        self.layers = layers\n",
    "\n",
    "    def _forward(self, x):\n",
    "        \"\"\"\n",
    "        Perform forward on x\n",
    "\n",
    "        Args:\n",
    "            x : a batch of data\n",
    "\n",
    "        Returns:\n",
    "            o : model output\n",
    "        \"\"\"\n",
    "        #########################################\n",
    "        ##              Your Code              ##\n",
    "        #########################################\n",
    "        pass\n",
    "    \n",
    "    def backward(self, upstream):\n",
    "        \"\"\"\n",
    "        Perform backward path on whole model\n",
    "\n",
    "        Args:\n",
    "            upstream : gradient of loss w.r.t model output\n",
    "        \"\"\"\n",
    "        #########################################\n",
    "        ##              Your Code              ##\n",
    "        #########################################\n",
    "        pass\n",
    "\n",
    "    def get_parameters(self):\n",
    "        \"\"\"\n",
    "        Returns:\n",
    "            parametric_layers : all layers of model which have parameter\n",
    "        \"\"\"\n",
    "        pass"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Loss functions (10 Points)\n",
    "We need to implement loss functions to be able to train our network. We will implement CrossEntropy loss function. But notice that we have implemented `LogSoftMax` in logarithmic way so input of the following class will be logarithm of probabilities. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "HIBN2fmBpYGG"
   },
   "outputs": [],
   "source": [
    "class CrossEntropyLoss(Module):\n",
    "    def __init__(self, mean=False):\n",
    "        self.mean = mean\n",
    "        self.cache = dict()\n",
    "        self.grads = dict()\n",
    "\n",
    "    def _forward(self, logprobs, targets):\n",
    "        \"\"\"\n",
    "        Calculate cross entropy of inputs.\n",
    "\n",
    "        Args:\n",
    "            probs : matrix of probabilities with shape (b,n)\n",
    "            targets : list of samples classes with shape (b,)\n",
    "\n",
    "        Returns:\n",
    "            y : cross entropy loss\n",
    "        \"\"\"\n",
    "        #########################################\n",
    "        ##              Your Code              ##\n",
    "        #########################################\n",
    "        y = None\n",
    "        \n",
    "        return y\n",
    "\n",
    "    def backward(self, upstream):\n",
    "        \"\"\"\n",
    "        Calculate gradient of loss w.r.t module input and save them in grads.\n",
    "\n",
    "        Args:\n",
    "            upstream : gradient of loss w.r.t module output (loss)\n",
    "        \"\"\"\n",
    "        #########################################\n",
    "        ##              Your Code              ##\n",
    "        #########################################\n",
    "        grad_scores = None\n",
    "        \n",
    "        self.grads['x'] = grad_scores"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# check gradient, output must be from o(e-10)\n",
    "ce = CrossEntropyLoss()\n",
    "check_gradient_ce(ce, h=0.0001)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Optimization (15 Points)\n",
    "\n",
    "Now that we have our network and loss function, we need to update model paremeters. We can do so by using `Optimizer` class that perform updating rule on model parameters. You need to implement `sgd` and `momentum` strategy for this optimizer. Becarefull to consider regularization update for linear units that require regularization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "pDROgraatB0p"
   },
   "outputs": [],
   "source": [
    "class Optimizer():\n",
    "    \"\"\"\n",
    "    Our main optimization class.\n",
    "    \n",
    "    You can add arguments to _sgd and _momentum function if you need to do so, and\n",
    "    pass this arguments to step function when using optimizer. Don't change __init__\n",
    "    or step function.\n",
    "    \"\"\"\n",
    "\n",
    "    def __init__(self, layers, strategy, lr):\n",
    "        \"\"\"\n",
    "        save layers here in order to update their parameters later.\n",
    "\n",
    "        Args:\n",
    "            layers : model layers (those that we want to update their parameters)\n",
    "            strategy : optimization strategy\n",
    "            lr : learning rate\n",
    "        \"\"\"\n",
    "        self.layers = layers\n",
    "        self.strategy = strategy\n",
    "        self.lr = lr\n",
    "        self.strategies = {\n",
    "            'sgd': self._sgd,\n",
    "            'momentum': self._momentum,\n",
    "        }\n",
    "\n",
    "    def step(self, *args):\n",
    "        \"\"\"\n",
    "        Perform updating strategy on all layers paramters.\n",
    "        \"\"\"\n",
    "        self.strategies[self.strategy](*args)\n",
    "\n",
    "    def _sgd(self):\n",
    "        \"\"\"\n",
    "        Perform sgd update on all parameters of layers\n",
    "        \"\"\"\n",
    "        #########################################\n",
    "        ##              Your Code              ##\n",
    "        #########################################\n",
    "        pass\n",
    "    \n",
    "    def _momentum(self):\n",
    "        \"\"\"\n",
    "        Perform momentum update on all parameters of layers\n",
    "        \"\"\"\n",
    "        #########################################\n",
    "        ##              Your Code              ##\n",
    "        #########################################\n",
    "        pass\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Section 2: CIFAR-10 Classification (35 Points)\n",
    "\n",
    "Now that we can build a neural network we want to solve CIFAR-10 classification problem. This dataset consists of 60000 $32 \\times 32$ coloured images in 10 classes."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data preparation (5 Points)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#################################################\n",
    "##      Run this cell to download dataset      ##\n",
    "##         the dataset is about 150 MB         ##\n",
    "#################################################\n",
    "!./cifar10_downloader.bash"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#############################################\n",
    "##      Run this cell to load dataset      ##\n",
    "#############################################\n",
    "data = load_dataset(train_num=4000, test_num=1000)\n",
    "\n",
    "for k in data.keys():\n",
    "    print(f'{k}: {data[k].shape}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "##############################################\n",
    "##      Split train set to train/val        ##\n",
    "################[Your Code]###################\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "##############################################\n",
    "for k in data.keys():\n",
    "    print(f'{k}: {data[k].shape}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "##################################################\n",
    "##      Visualize 5 samples from each class     ##\n",
    "##################[Your Code]#####################\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#####################################################\n",
    "##             Normalize and flatten X             ##\n",
    "####################[Your Code]######################\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "#####################################################\n",
    "for k in data.keys():\n",
    "    print(f'{k}: {data[k].shape}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train and Test Model (25 Points)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Batch Sampler\n",
    "We need to sample bathces from our dataset to train model. Complete the following class to have a random sampler."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class RandomSampler(object):\n",
    "    def __init__(self, batch_size, dataset, type):\n",
    "        \"\"\"\n",
    "        Args:\n",
    "            batch_size : sampler batch size\n",
    "            dataset : dataset we want to get batch from that\n",
    "            type : one of {'train', 'test', 'val'}\n",
    "        \"\"\"\n",
    "        self.batch_size = batch_size\n",
    "        self.dataset = dataset\n",
    "        self.x_key = f'X_{type}'\n",
    "        self.y_key = f'Y_{type}'\n",
    "        self.indices = None\n",
    "        self.num_batches = None\n",
    "        ################################################################\n",
    "        ##       Build batches indices and store them in indices      ##\n",
    "        ##          Also store number of batches in num_batches       ##\n",
    "        ##                          Your Code                         ##\n",
    "        ################################################################\n",
    "\n",
    "    def __len__(self):\n",
    "        assert type(self.num_batches) == int\n",
    "        return self.num_batches\n",
    "\n",
    "    def __iter__(self):\n",
    "        \"\"\"\n",
    "        This function call when we iterate an object of this class and\n",
    "        yields one batch on each call.\n",
    "\n",
    "        Yields:\n",
    "            (x, y) : tuple of bathces of x and y\n",
    "        \"\"\"\n",
    "        for idx in self.indices:\n",
    "            x = self.dataset[self.x_key][idx]\n",
    "            y = self.dataset[self.y_key][idx]\n",
    "            yield x, y\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Fill the following functions to update a confusion matrix and calculate f1 score for a confusion matrix. For multi class f1 score read [here](https://towardsdatascience.com/multi-class-metrics-made-simple-part-ii-the-f1-score-ebe8b2c2ca1)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def update_confusion_matrix(conf_matrix, preds, reals):\n",
    "    \"\"\"\n",
    "    Updates confusion matrix\n",
    "\n",
    "    Args:\n",
    "        conf_matrix : input confusion matrix\n",
    "        preds : array of predicted labels\n",
    "        reals : array of real labels\n",
    "\n",
    "    Returns:\n",
    "        conf_matrix : updated confusion matrix\n",
    "    \"\"\"\n",
    "    #################################\n",
    "    ##          Your Code          ##\n",
    "    #################################\n",
    "    return conf_matrix\n",
    "\n",
    "\n",
    "def f1_score(confusion_matrix):\n",
    "    \"\"\"\n",
    "    calculate macro f1 score from given confusion matrix\n",
    "\n",
    "    Args:\n",
    "        confusion_matrix : given confusion matrix\n",
    "        \n",
    "    Returns:\n",
    "        f1 : macro f1 score\n",
    "    \"\"\"\n",
    "    #################################\n",
    "    ##          Your Code          ##\n",
    "    #################################\n",
    "    f1_score = None\n",
    "    \n",
    "    \n",
    "    return f1_score"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Define Model\n",
    "Define an MLP model to solve classification problem."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "###############################################\n",
    "##             Define your model             ##\n",
    "##     use a good initializer for layers     ##\n",
    "###############################################\n",
    "\n",
    "model = None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#########################################\n",
    "##          Hyper parameters           ##\n",
    "#########################################\n",
    "\n",
    "n_epochs = None\n",
    "batch_size = None\n",
    "lr = None\n",
    "reg_coeff = None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "##################################################\n",
    "##      Define optimizer, loss and sampler      ##\n",
    "##################################################\n",
    "\n",
    "optimizer = None\n",
    "criterion = None\n",
    "train_sampler = None\n",
    "val_sampler = None\n",
    "test_sampler = None"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Train Model\n",
    "\n",
    "Fill in the below cell to train the model. Store each epoch loss, accuracy and f1-score. Use f1-score to choose best epoch.\n",
    "\n",
    "**Note1**: To do backpropagation you need to first call `backward` function of criterion with 1 as its argument to have gradient of loss w.r.t output of this module and then using model `backward` function with `criterion.grads['x']` argument.\n",
    "\n",
    "**Note2**: You can ignore regularization term in your total loss value and just use criterion, but you must consider that during updating."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#########################################\n",
    "##      Train and Validation loop      ##\n",
    "#########################################\n",
    "train_losses, val_losses = [], []\n",
    "train_accs, val_accs = [], []\n",
    "train_f1, val_f1 = [], []\n",
    "best_model = None\n",
    "best_f1 = 0\n",
    "\n",
    "for epoch in range(n_epochs):\n",
    "    \n",
    "    # Train Phase\n",
    "    total_loss = 0\n",
    "    N = 0\n",
    "    confusion = np.zeros((10, 10))\n",
    "    with tqdm.tqdm(enumerate(train_sampler), total=len(train_sampler)) as pbar:\n",
    "        for i, (x, y) in pbar:\n",
    "            #################################\n",
    "            ##          Your Code          ##\n",
    "            #################################\n",
    "            \n",
    "            acc = None\n",
    "            f1 = None\n",
    "            pbar.set_description(f'Train {epoch} | Loss:{total_loss/N:.2e} | Acc: {acc:.2f}| F1: {f1:.2f}|')\n",
    "    \n",
    "    # save epoch metrics for train phase\n",
    "    train_losses.append(total_loss/N)\n",
    "    train_accs.append(acc)\n",
    "    train_f1.append(f1)\n",
    "    \n",
    "\n",
    "    # Validation Phase\n",
    "    total_loss = 0\n",
    "    N = 0\n",
    "    confusion = np.zeros((10, 10))\n",
    "    with tqdm.tqdm(enumerate(val_sampler), total=len(val_sampler)) as pbar:\n",
    "        for i, (x, y) in pbar:\n",
    "            #################################\n",
    "            ##          Your Code          ##\n",
    "            #################################\n",
    "            \n",
    "            acc = None\n",
    "            f1 = None\n",
    "            pbar.set_description(f'Val   {epoch} | Loss:{total_loss/N:.2e} | Acc: {acc:.2f}| F1: {f1:.2f}|')\n",
    "    \n",
    "    # save epoch metrics for validation phase\n",
    "    val_losses.append(total_loss/N)\n",
    "    val_accs.append(acc)\n",
    "    val_f1.append(f1)\n",
    "    \n",
    "    \n",
    "    #################################\n",
    "    ##       update best model     ##\n",
    "    ##          Your Code          ##\n",
    "    #################################\n",
    "    \n",
    "    \n",
    "    print(f'----------------------------[Epoch{epoch+1} finished!]----------------------------')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Test Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "######################################################################\n",
    "##      Plot train and validation loss, accuracy and f1 graphs      ##\n",
    "######################################################################\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "############################################################\n",
    "##                  Test your best model                  ##\n",
    "##          Report loss, accuracy and f1 metrics          ##\n",
    "##      Also plot the confusion matrix for test data      ##\n",
    "############################################################\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Weights Visualization (5 Points)\n",
    "\n",
    "For the last part we want to visualize weights matrix of the first layer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "##########################################################\n",
    "##          Visualize n of first layer weights          ##\n",
    "##          First reshape them to (32, 32, 3)           ##\n",
    "##########################################################\n",
    "n = 5\n"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  },
  "vscode": {
   "interpreter": {
    "hash": "81794d4967e6c3204c66dcd87b604927b115b27c00565d3d43f05ba2f3a2cb0d"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
